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Abstract 

We use traveling-wave theory to derive expressions for the rate of accumulation of 
deleterious mutations under Muller's ratchet and the speed of adaptation under pos- 
itive selection in asexual populations. Traveling-wave theory is a semi-deterministic 
description of an evolving population, where the bulk of the population is modeled 
using deterministic equations, but the class of the highest-fitness genotypes, whose 
evolution over time determines loss or gain of fitness in the population, is given 
proper stochastic treatment. We derive improved methods to model the highest- 
fitness class (the stochastic edge) for both Muller's ratchet and adaptive evolution, 
and calculate analytic correction terms that compensate for inaccuracies which arise 
when treating discrete fitness classes as a continuum. We show that traveling wave 
theory makes excellent predictions for the rate of mutation accumulation in the case 
of Muller's ratchet, and makes good predictions for the speed of adaptation in a 
very broad parameter range. We predict the adaptation rate to grow logarithmically 
in the population size until the population size is extremely large. 
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1 Introduction 



"I was observing tlie motion of a boat wliich was rapidly drawn along a 
narrow channel by a pair of horses, when the boat suddenly stopped — not so 
the mass of water in the channel which it had put in motion; it accumulated 
round the prow of the vessel in a state of violent agitation, then suddenly 
leaving it behind, rolled forward with great velocity, assuming the form of a 
large solitary elevation, a rounded, smooth and well-defined heap of water, 
which continued its course alo ng the channel apparently without change of 
form or diminution of speed." (iRusselll . Il845l ) 



One of the fundamental models of population genetics is that of a finite, 
asexually reproducing population of genomes consisting of a large number 
of sites with multiplicative contribution to the total fitness of the genome. 
This model has been studied for decades, and has presented substantial chal- 
lenges to researchers trying to solve it analytically. Even in its most basic 
formulation, where each site is under exactly the same selective pressure, the 
model has not been fully solved to this day. For the special case of v anishing 
back mutations, the mo del reduces to the problem of MuUer's ratchet (iMullerl . 
1964 : iFelsensteinl . 11974) . A tremendous amount of research effort has been di- 



rected at this problem (IHaighl. Il978l: iPamilo et al.l. 119871: IStephan et al.l. Il993 



Higgs and Woodcockl . Il995l : iGordo and Charlesworthl . l2000a| ]bl: iRouzine et al. 



20031 ). Other special cases of this model are mut ation-selection balance whe n 
the forward and back mutation rates are equal ( Woodcock and Higgs . [l996il . 



and the speed of adaptation under various condit i ons ( Tsimring et al 



Kessler et al.l . 119971 : iGerrish and Lenskil . Il998l : lOrrl . l2000l : iRouzine et al. 
Wilkd . I2OO4I ). 



19961 : 



20031: 



In 1996. lTsimring et al. I pioneered a new approach to studying the multiplica- 
tive multi-site model. They described the evolving population as a localized 
traveling wave in fitness space, using partial differential equations developed 
to describe wave-like phenomena in physical systems. A traveling wave is a 
localized profile traveling at near-constant speed and shape (physicists refer 
to such phenomena also as solitary waves). We can envision a population as 
a traveling wave of the distribution of the mutation number over genomes if 
the relative mutant frequencies in the population stay approximately constant 
while the population shifts as a whole. For example, a population may have 
specific abundances of sequences at one, two, three, or more mutations away 
from the least loaded class at all times, but the least-loaded class moves at 
constant speed by one mutation every ten generations. 



Encouraging results by iTsimring et al.l (119961 ) were based on two strong ap- 
proximations. Firstly, all fitness classes, including the best-fit class, were de- 
scribed deterministically, neglecting random effects due to finite population 
size. Finite population size was introduced into the problem as a cutoff of the 
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effect of selection at tlie liigfi-fitness ed ge, wfien tfie size of a class becomes 
less than one copy of a genome. Secondly, iTsimring et al. fll996h approximated 
the traveling wave profile with a function cont i nuous in fitness (or mutation 
load). In these approximations, iTsimring et al.l (119961 ) demonstrated the exis- 
tence of a continuous set of waves with different speeds. The cutoff condition 
determined the choice of a specific solution and the dependence of the speed 
on the population size. 



Rouzine et al.l (120031 ) confirmed the qualitative conclusions by iTsimring et al. 
( 1l996l ) and refined their quantitative results in two ways, by taking into ac- 
count the random effects acting on the smallest, best-fit class, and showing 
that, in a broad parameter range, approximating the logarithm of the wave 
profile as a smooth function of the fitness is a much better approximation than 
approximating the wave profile itself as a smooth function. It was shown that 
the substitution rate increases logarithmically with the population size, un- 
til the deterministic single-site limit is reached at extremely large population 
sizes and the theory breaks down. 



The p urpose of the present paper is to show that the results by iRouzine et al. 
( 12003! ) can still be improved regarding both the treatment of the high-fitness 
edge and the deterministic part of the fitness distribution. Because the original 
work was presented in an extremely condensed format, we will here re-derive 
the general theory in detail. Then, we will present improved treatments of 
the stochastic edge that lead to accurate predictions for the substitution rate 
defined as the average gain in beneficial alleles per genome per generation. 
We consider in detail two opposite parametric limits, Muller's ratchet (when 
beneficial mutation events are not important) and adaptive evolution (when 
deleterious mutations are not important). 



Because the range of validity of our approach caused some confusion in the 
literature, we discuss it in detail in the main text and Appendix. Briefly, both 
in Muller's ratchet and in the adaptation regime, we assume that the total 
number of sites is large, and the selection coefficient s is small. The population 
size should be sufficiently large, so that the difference in the mutational load 
between the least-loaded and average genomes is much larger than 1. In other 
words, a large number of sites are polymorphic at any time. For the adaptive 
evolution, the condition corresponds to the average substitution rate V being 
much larger than s/ ln{V/Uh), where f/b is the beneficial mutation rate, or pop- 
ulation sizes being much larger than ^s/Uj^/ \n{V/Ub). We also assume that V 
is much larger than Ub- In smaller populations, adaptation occurs by isolated 
selective sweeps at different sites, and one-site models a pply. Two-site mod- 



f 

els of adaptation , such as the clonal interference theory (IGerrish and Lenski 



1998; Orr 



2000l : IWilkd . |2004| ). can be used to describe the narrow transi- 



tional interval in N. Further, if the deleterious mutation rate per genome is 
much larger than s, and the average population fitness is sufficiently high, an 
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additional broad interval of appears, where deleterious mutations accumu- 
late (Muller's ratchet). Using numeric simulations and analytic estimates, we 
demonstrate good accuracy of our results in a very broad parameter range rel- 
evant for various asexual organisms, including asexual RNA and DNA viruses, 
yeast, some plants, and fish. 

The manuscript is organized as follows. In Section 2, we describe the model 
and the general method to derive evolutionary dynamics in terms of the fitness 
distribution. In Section 3 and 4, we consider in detail two particular cases, 
Muller's ratchet and the process of adaptation, respectively, and test analytic 
results with computer simulations. In Section 4, we discuss our findings. 



2 Traveling-wave theory 



2.1 Model assumptions 



We consider a multi-site model of L sites, where each site can be in two 
states, i.e., carry one of two alleles, either advantageous or deleterious. The 
deleterious allele reduces the overall fitness of a genome by a factor of 1 — s, 
where s -C 1 is the selective disadvantage per site. We assume that there 
are no biological interactions between sites (epistasis), so that the fitness of a 
genome with k deleterious alleles (mutational load) is (1 — s)^ ~ e"'^*. We refer 
to the frequency of sequences with mutational load k in the population as /fc, 
and write the population-average fitness as w^v = J2k fk- We introduce 
fcav, which is mutational load for which a sequence's fitness is exactly equal 
to the population mean fitness, as given by = e'^'^'" . For small s, /Cav is 
approximately the average mutational load in the population, fcav ~ ^fk- 
We denote the mutational load of the best-fit sequence in the population as /cq. 
Note that in general /cq 7^ 0, that is, the best- fit sequence in the population 
is not the sequence with the overall highest possible fitness. 



We assume that an allele can mutate into an opposite allele with a small 
probability yU. For the sake of simplicity of the derivation, we assume that 
the mutation rate is low, so that there is, at most, one mutation per genome 
per generation. The case when multiple mutations per genome are frequent 
takes place at very large population sizes when the avera ge substitution rate is 
high, larger than one new beneficial allele per generation. iRouzine et al.l (120031 ) 
considered this more general case and showed that there is no essential change 
in the final expression for the substitution rate. Thus, for finite population size, 
the assumption of a single mutation per genome per round of replication is 
not limiting (see also the next subsection). 
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In r e al Ken omes, s varies between sites. Moreover, iGillespid (119831 . Il99ll ) and 



OrrI (120031 ) argued that the distribution of s differs between sites with beneficial 



and deleterious alleles. A viable genome represents a highly-fit, non-random 
selection of alleles, so that deleterious mutations should generally have larger 
effects than beneficial mutations. For the same reason, these authors predicted 
that the effective distribution of s for beneficial alleles should have a univer- 
sal exponential form. In the present work, we do not consider variation in 
s. Instead, we use a simplified model including only those sites into the to- 
tal number of sites L — with either deleterious and beneficial alleles — whose 
selection coefficient is on the order of the same typical value s, and approxi- 
mate all selection coefficients at these sites with a constant s. The choice of s 
and, hence, of the set of included sites depends on the time scale of evolution 
under consideration. Strongly deleterious mutations with effects much larger 
than s are cleared rapidly from a population. Strongly beneficial mutations are 
fixed at the e arly stages of evolution. No te that in the "clonal interference" ap- 
proximation (iGerrish and Lenskil . Il998l ). which considers competition between 
two beneficial clones emerging at two sites, variation of s must be taken into 
account to make continuous adaptation possible. In contrast, in the present 
theory, which allows new beneficial clones to grow inside of already existing 
clones, the importance of variation in s is less obvious. We hope to address 
this matter elsewhere. 



We discuss the validity of our approach in detail for the limits of Muller's 
ratchet and adaptation and give a summary of the central simplifications — 
numbered 1 through 6 and referenced throughout the text — in the Appendix. 
Note that we can verify the validity of the various assumptions only after the 
fact, once we have obtained our final results. All simplifications are asymptot- 
ically exact, i.e., are based on the existence of small dimensionless parameters. 
The most limiting requirements are that the high-fitness tail of the distribu- 
tion is long, and that the distribution is far from the unloaded and fully loaded 
(possible best-fit and less-fit) genomes. 



2.2 General approach 



The first idea underlying the approach of "solitary wave" is to classify all 
genomes in a population according to their fitness (mutational load), regardless 
of specific locations of deleterious alleles in a genome, and focus on evolution of 
fitness classes. The second idea is to describe evolution of most fitness classes 
deterministically. To take into account the effects of finite population size, 
such as genetic drift and randomness of mutation events, only one class with 
the highest fitness is described stochastically using the standard two-allele 
diffusion approach. The best-fit class is considered a minority "allele" in a 
population, and all other sequences are considered the majority "allele". The 
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reason why stochastic effects can be neglected aheady for the next-to-best 
class is that, in a broad parameter range, the fitness distribution decreases 
exponentially towards the stochastic edge. Hence, the next-to-best class is 
large enough to neglect stochastic effects, especially in the adaptation regime 
(see estimates in Sections 3 and 4). 



We note that neglecting stochastic effects completely and considering the limit 
of infinite population is not correct. As we show below, stochastic processes 
acting on the best-fit class limit the overall evolution rate and make it depen- 
dent on the population size. Even for a modest number of sites [L = 15-20), 
the substitution rate is predicted to reach the true dete rministic limit only in 



astronomically large populations not found i n nature (iTsimring et al.l . 11996 



Rouzine et al.l . l2003l : ^esai and Fisherl . |2007| ). For the same reason, the as- 



sumption of one mutatio n per genome we mad e in our model is not a limiting 
factor and, as shown by iRouzine et al.l (120031 ). does not change much in the 
final expression for the average substitution rate. 



The formal procedure consists of several steps, as follows. 



(i) The frequencies of all fitness classes excluding the best-fit class are de- 
scribed by a deterministic balance equation. 

(ii) The equation is shown to have a traveling wave solution with an arbitrary 
speed (the average substitution rate). 

(iii) The leading front of the wave (high-fitness tail of the distribution) is 
shown to end abruptly at a point, expressed in terms of the wave speed. 

(iv) The difference between the values of the fitness distribution at the center 
and the edge is expressed in terms of the wave speed. 

(v) The value at the center is found from the normalization condition. 

(vi) Because the biological justification for the lack of genomes beyond the 
high-fitness cutoff is finite size of population, the cutoff point is identified 
with the stochastic edge. 

(vii) To determine the wave speed as a function of the population size, the 
average frequency of the least-loaded class is estimated from the classical dif- 
fusion result and matched to the deterministic cutoff value. 
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2.3 Equation for the deterministic part of the fitness distribution 



We proceed with the first step. On the basis of our model assumptions and 
neglecting multiple mutation events per generation per genome, we can write 
the deterministic time evolution of the frequency fk{t) of genomes with mu- 
tational load k as 

fk{t + 1) = ^7- ( f^{L - k + 

+ e-^^'+'^fiik + + e-^'il - /iL)/,(t)) , (1) 

where k runs from to L and f-i(t) = fL+i{t) = 0. By definition, J2k fk{t) = 1 
for all times t. Now we introduce the total per-genome mutation rate U = fiL, 
and the ratio of beneficial mutation rate per genome to the total mutation 
rate, = fik/U = k/L. Inserting these expressions into Eq. ([T]), expressing 
Wav in terms of /Cav, expanding e~'^^ ^ 1 — sx (which we are allowed to do 
under the condition that s\k — kg^-y\ <^ 1, Simplification 1), and neglecting all 
terms proportional to sU, we arrive at: 

fk{t + 1) = U{1 - ak-i)fk-i{t) + Uak+ifk+i{t) 

+ [l-U -s{k-k^,)]fk{t). (2) 

As mentioned before, we consider the case when the traveling wave is far 
from the sequence with the highest possible fitness, = 0, as given by the 
condition |A; — fcavl ^ ^av Therefore, ak depends only slowly (linearly) on /c, 
we are allowed to replace auhy a = au^^ (Simplification 2), and find 

fk{t + 1) - fk{t) = U{1 - a)/fc-i(t) + Uafk+i{t) -[U + s{k - k,,)]fk{t) . (3) 



Our goal is to turn this expression into a continuous differential equation. As 
the mutation rates are low, fk{t) evolves very slowly in time, and we can write 
fkit + 1) ^ fkit) + dfkit)/dt (Simplification 3). 

We need to be more careful when making a continuous approximation for 
fk{t) as a function of k. As we show below, fk{t) changes rapidly with k in 
the important high fitness tail. Therefore, the Taylor expansion of fk{t) is not 
justified. However, the logarithm of fk{t) is a smooth function of in a broad 
parameter range, provided the "lead" of the distribution is large (Simplifica- 
tion 4). [The lead is the difference in number of mut ations from the populatio n 
center to the least-loaded class in the population (iDesai and Fisheii . I2OO7I I] 
Therefor e, a better appr o ximat ion, which represents an improvement on the 
work by Tsimring et al.l ( 1996 ). is to do the Taylor expansion on In/^ and 
write fk+i(t) = fk(t) exp[d\ia fk{t)/dk]. With these approximations, and af- 
ter introducing a rescaled time dr = Udt and a rescaled selection coefficient 
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(T = s/U, we find 

d\nfk{t)/dT = (1 - a)e-^i'^^'=W/^'' + ae^'^/^W/^'^ - a{k - K,) - 1 . (4) 

This nonlinear partial differential equation describes the deterministic move- 
ment of the population in fitness space over time. 

(Note that, for sufficiently large iV, the assumptions of Simplifications 1 and 
3 may be violated, so that technically, we can neither expand fitness in k 
nor replace di s crete time with continuous time in this regime. Nevertheless, 



Rouzine et al.l (120031 ) showed that the continuous equation for the fitness dis- 
tribution, Eq. (jlj), can be derived in a more general way, without assuming 
din fk{t)/dt ^ 1, nor expanding fitness in k, nor neglecting multiple muta- 
tions per genome per generation [see the transition from Eq. (1) to Eq. (11) 
in the Supplementary Text of the quoted work]. In the present work, we use 
these approximations only to simplify our derivation.) 

In the equation above, a and a depend, strictly speaking, on time. The de- 
pendence is, however, very weak in the limit of a large genome size L. In the 
remainder of this paper, we shall assume that the observation time is very 
small compared to the time in which k^v changes by L and, thus, a and a can 
be considered constant. At the same time, because the distribution is very far 
from the class without deleterious alleles, \k — fcavl ^ ^av, a considerable shift 
of the distribution can occur during the observation time. 



2.4 Traveling wave solution 



A broad class of partial differential equations affords solutions in the form of 
traveling waves. A wave is a fixed shape, described by a function (f){x), that 
moves through space without changing its form. In the case of an evolving 
population, space corresponds to the mutational load, k. A traveling wave 
solution to Eq. (jlj) can be expressed as the movement of the center of the 
population over time, fcav(T), and the shape of the wave around this center, 
which we write as 0[A;— A;av(T)]. We therefore make the ansatz (Simplification 5) 
that 

ln/fc(t) = 0[A;-A;av(r)] (5) 
After inserting Eq. (jSj) into Eq. and writing x = k — kg^-^lr), we obtain 

ax=il- a)e-^'(") + ae'^'^'^ + v(P\x) - 1 , (6) 

where 0'(x) = d(f){x)/dx, and we have introduced the wave velocity v, that is, 
the speed of movement of the population center A;av over time, v = dk-^^ij) / dr . 

For a given wave velocity f , Eq. ([6]) and 0(0) fully specify the shape of the 
wave (p^x). Although it is not possible to solve Eq. ([HD in the general form. 
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analytically, below we will be able to find the velocity of the wave. The first 
step is to derive 0(0) from the normalization condition. 



2.5 Normalization condition. Width and speed 



The validity of the approach by iRouzine et al.l (120031 ) requires that the log- 



arithm of the fitness distribution, 0(x), can be approximated with a smooth 
function of x. In other words, the characteristic scale in x given by the length 
of the high- fitness tail is much larger than unit. The condition is always met 
when population is evolving at many sites at a time. Because the fitness dis- 
tribution itself, exp[0(x)], is not replaced with a smooth function of x, it does 
not have to be broad: Whether the half-width of the wave is small or large 
is irrelevant for the validity of the approach. In the treatment of adaptation 
limit (Section 4), the wave can be either broad or narrow, depending on the 
range of population sizes. The two cases, however, have different normalization 
conditions, as follows. 

If the wave is narrow, Var[/c] = Var[x] -C 1, most of the distribution is localized 
at A; ~ fcav, and from the normalization sum I]fc=o fk{t) we have 

0(0) ^ 0. (7) 

(The exception is the case when fc^v is nearly half-integer, and the two adjacent 
classes have comparable sizes.) 

If the wave is broad, Var[A;] ^ 1, the normalization condition is less trivial. 
The normalization sum Yl,k=o fk{t) can be replaced with an integral, / e'^^^^dx. 
As we will see momentarily, the main contribution to the integral comes from 
the interval of x such that |x| is much larger than 1 but much smaller than 
the high-fitness tail length, so that |0'(^)l 1- Hence, we can expand the 
exponential terms in Eq. ([6]) to the first order (Simplification 6), and find 

(f) [x) -- 



l-2a-v 



Clearly, |0'(a;)| <^ 1 when ^ {1 — 2a — v)/a, and therefore this latter 
condition determines the validity of Eq. ([8]). Integrating in x, taking into 
account the normalization condition, we obtain the expression for 0(x) near 
the wave center 



„2 



a ax , , 

0x = nJ — -. 9 
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In particular, the desired expression for 0(0) has a form 

^W = '°^ 2.(l-2„-„) -' (^°) 

Note that 0(x) was defined as 0(x) = In fk(t). Therefore, Eq. imphes that 
the distribution of genomes in the vicinity of /cav is approximately Gaussian, 
with variance 

Var[k] = {l-2a-v)/a. (11) 

Because the variance and the argument of the square root in Eq. (fTOl) must 
be positive, the wave velocity has to satisfy f < 1 — 2a. In other words, 
if deleterious mutations accumulate, the rate of their accumulation cannot 
exceed 1 — 2a if time is measured in units of U. Also, as we already stated, the 
Gaussian expression for the central part of the wave is valid if the width of the 
wave ^J'VsiT[k] is much larger than 1. For moderate or small wave speeds, |f | ~ 1 
or |f I <^ 1, this condition implies that a is much smaller than 1. Whether 
the wave is actually narrow or broad, given the population size and other 
parameters, will be discussed below for Muller's ratchet and the adaptation 
regime. 

Eq. ( fTTI) . which is known as Fisher Fundamental Theorem, links the width 

of the population's mutant distribution, ^JVai\k\, to the velocity v at which 
the wave is traveling. We emphasize that the validity of this expression is not 
restricted to the case of broad fitness distribution and can be derived directly 
from Eq. ([3]) (Appendix) even if Var[/c] <^ 1. Qualitatively, the theorem states 
that, the broader the wave, the larger the characteristic difference in fitness 
between two representative genomes, and the stronger the effect of positive 
selection on the substitution rate. 



2. 6 High- fitness edge of distribution 

In the previous subsections, we have derived a continuous wave equation that 
gives a deterministic description of how the bulk of the population moves over 
time in fitness space. However, the speed of the wave remains undefined. As 
it turns out, the behavior of a small class of best-fit genomes determines the 
speed. 

Below we show that the deterministic wave has a cutoff in the high-fitness tail 
of the wave whose position is defined by the wave speed and model parame- 
ters. The biological reason behind the deterministic cutoff are stochastic fac- 
tors acting on finite populations. Highly-fit genomes are either gradually lost 
(Muller's ratchet) or gradually gained (adaptation) by a population. Thus, 
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the deterministic cutoff is identified with the "stochastic edge" of the wave 
that gradually advances or retreats (Fig. [1]). The rate of the loss or gain of 
the least-loaded class, which depends on the population size, ultimately de- 
termines the speed of the wave. To obtain the desired condition for the wave 
speed, in the next subsection, we will match the deterministic description to 
stochastic behavior of the least-loaded class. 

We cannot solve Eq. explicitly. Fortunately, we can extract the location 
of the stochastic edge by interpreting Eq. ([6]) as an equation that describes 
a function rather than a function If the function has an 

absolute minimum, then this minimum implies that has a deterministic 
cutoff, and thus the minimum must coincide with the stochastic edge (Fig. [2]). 

Therefore, consider the function x{(j)') = [(1 — a)e~'^' + ae'^' + v(j)' — l]/o". 
Clearly, x{(j)') oo for cf)' ±oo, as long asO<a;<lora = and f > 0. 
To locate any minima, we calculate the derivative dx{(j)')/d(j)' , and find 

= [-(1 - a)e-^' + ae'^' +v]/a = 0. (12) 

We solve the resulting quadratic equation in e'^' and arrive at 

\u. (13) 



^ 2a 



V + Jv'^ + 4a(l - a) 



The alternative solution, with a minus sign in front of the square root, is not 
possible for real- valued cp'. Thus, the derivative of x((/)') has only a single root, 
which must correspond to an absolute minimum. The value of x{(j)') at this 
minimum, which we denote by Xq, follows as 

Xq = (1 — 2au — v\nu — v) , (14) 

a 

where we have made use of the identity {l — a)/u = au + v. For any mutation 
classes k corresponding to x < a;o, we have = 0. For the continuous approach 
to work, we need |xo| ^1 (Simplification 4). 



2. 7 Difference between fitness distribution at center and edge 



Our eventual goal is to find an expression that relates the speed of the wave, 
V, to the population size A^, mutation rate U, selective coefficient s, and the 
fraction of beneficial mutations a. It turns out that we can achieve this goal 
by considering the difference 0(0) — (j){xo). By evaluating this difference in two 
alternative ways, we end up with the desired relation. 
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First, we note that 

(f){0) - (f){xo) = / (j)'{x)dx. (15) 

We can evaluate the integral on the right-hand side by making the subsitution 
X = integrating by parts, and using 4>'{xo) = Inu and 0'(O) = 0. (The 

former condition stems from the definition of u, while the latter condition is 
a consequence of Eq. ([H]); see also Fig. O) We arrive at 

(j){0)-(j){xo) = -xolnu- x{(j)')d(f)' . (16) 

J\nu 

We can evaluate the remaining integral on the right-hand side by integrating 
over Eq. ([6]) . We obtain, after inserting Eq. (fl^ into the first term on the 
right-hand side of Eq. (fT6|l , 

0(0) - 0(xo) = ^(^1 - 2a - |[ln^(eM) + 1] - 2au\nuJ , (17) 

where e is Euler's constant, ln(e) = 1. 

We have now calculated 0(0) — 0(xo) by integrating over 0'(x). Alternatively, 
we can calculate this difference by evaluating 0(x) directly at and at Xq. 0(0) 
follows from Eq. ffTOl) if the argument of the logarithm is much smaller than 
1 (large half- width of the wave), and 0(0) = in the opposite limit (small 
half-width). The quantity 0(xo) represents the logarithm of the genome fre- 
quency right at the stochastic edge. Since this value is dominated by stochastic 
effects, we cannot evaluate 0(xo) on the basis of the deterministic theory we 
have developed so far. Therefore, we proceed as follows. We give the genome 
frequency at the stochastic edge proper stochastic treatment and base our 
estimate of the expected genome frequency at the stochastic edge on these 
probabilistic arguments. 



2.8 Stochastic treatment of the least-loaded class 



The deterministic derivation in the previous subsections demonstrates the 
existence of a continuous set of solitary waves with various speeds, i.e., average 
substitution rates. Now, to choose the correct solution and determine the 
actual substitution rate, we have to take into account stochastic effects acting 
on the least-loaded class, k = k^. At this point, finite population size enters 
the scene. We describe the dynamics of the le a st-loa d ed class using the one - 



sit e, two- a llele model (see reviews in iKimural (119641 ). iRouzine et al.l (120011 ) 



or 



EwensI tooj )). The frequency of the least-loaded class is analogous to the 



concentration of the better-fit allele. To justify the deterministic treatment of 
the next-loaded class (and other classes) and estimate the error introduced by 



12 



that approximation, we will estimate its size in the limits of adaptation and 
Muller's ratchet in Sections 3 and 4, respectively. 

The accurate treatment of the stochastic dynamics of least-loaded class, fko{t), 
requires the use of a diffusion equation that includes both deterministic fac- 
tors (selection and mutation) and random genetic drift. For the purpose of 
the present work, a good accuracy can be ensured by a simpler approach, as 
follows. 



A well-known fact following from the diffusion equation is that deterministic 
factors dominate when the frequency of a mi nority allele in a population (in 



our case, fkn(t)) is rnuch la rger than 1/(A^| 5*1 ) (IMaynard Smithl . Il97ll : iBarton 



19951 : iRouzine et al.l . l200ll ): in the opposite case, random drift rules, and the 
minority is most likely to be lost. Here expdS"!) is the fitness advantage of the 
better-fit allele. If the best-fit class is much larger than that value, we can use 
the deterministic equation, Eq. ([3]), which yields 

dfkJdt = Mit)-Sh,it) (18) 

with M{t) = Uafko+i(t) and S = U + s(A;o — ^av)- The parameter S represents 
the effective coefficient of selection against the best-fit class in the popula- 
tion. It consists of two parts: the positive part U due to mutations removing 
genomes from the class, and the negative part due to the fitness difference 
between the least-loaded sequence and the average genome, s(A;o — ^av)- 

As we show below, S is positive and M is negligible in the ratchet limit, 
f > 0, and negative in the adaptation limit, v < 0, where M{t) is more 
pronounced but still relatively small. In the ratchet limit, the least-loaded class 
decreases exponentially until it hits the characteristic size l/(A^|iS'|) and is lost. 
In the adaptation limit, the least loaded class is created due to a mutation 
and becomes established, i.e., survives and grows further with probability on 
the order of 1/2, when its frequency in a population exceeds 1/(A^|S'|). The 
cost of treating a characteristic frequency 1/(A^|S'|) as a sharp threshold is 
an undefined numeric factor multiplying the population size, A^. Because the 
speed of evolution, as it comes out in the end, depends on A^ logarithmically, 
and the interesting values of A^ are quite large, the error is relatively small. 

The deterministic formalism described in the previous section predicts a monotonous 
dependence on time for a fitness class (the time derivative changes sign only 
at /c = fcav)- In fact, because the mutational load of the least-loaded class 
changes abruptly in time, the size of the current least-loaded class oscillates in 
time producing a saw-like dependence. In the case of the MuUer's-ratchet, the 
class contracts until lost (Fig. [3]), and the next-loaded class takes it place, and 
in the case of adaptation, the class expands until beneficial mutations create a 
new least-loaded class (Fig. [6]). To match the two formalisms, we require that 
the logarithm of the solitary wave at the edge, 4>{xo), is equal to the value 
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In fkg (t) averaged over one period of oscillations, as given by 



ln^ + ln/„ 



(19) 



where /max is the maximum size of the least-loaded class size. The value of /max 
is given by /fco(t) at the moment t when either the previous least-loaded class 
was just lost (ratchet case) or a new least-loaded class was just established 
(adaptation case). In the next two sections, we derive the value of /max and 
determine 0(xo) from Eq. (fT9|) in each case separately. To partly compensate 
the error introduced by discreteness of k and matching the continuous fitness 
distribution to the oscillating edge, below we derive corrections to the con- 
tinuous approximation. In the case of adaptation, the uncompensated error 
in the substitution rate amounts to 10-20% in the entire parameter range we 
tested (see simulation in Section 4). 

We note that the time dependence of the classes other than the least-loaded 
class also has an oscillating component. The period of these oscillations is the 
same as for the least loaded class, corresponding to a shift of the wave by one 
notch in k. Because the average class size increases exponentially away from 
the edge, the relative magnitude of oscillations decreases rapidly away from 
the edge, and the described method of matching the edge to the bulk yields 
fair accuracy (see simulation in the next two sections). The remaining effect 
of oscillations will be partly accounted for by the correction for discreteness 
of k, which we introduce in the next two sections. 



3 Muller's ratchet 



In the absence of beneficial mutations, all genomes have at least as many mu- 
tations as their parents. Therefore, if the class of mutation-free genomes is lost 
from the population because of genetic drift, it cannot be regenerated in an 
asexual population. Moreover, now the class of one-mutants is at risk of being 
lost, and once it is lost, the class of two- mutants is at risk, and so on. In this 
way, an asexual population will inexorably experience accumul ation of muta- 
tions and decay of fitness. This process was first described by iMuUerl (119641 ) 
in an argument for why sexua l reproduction is beneficial, and is commonly 
referred to as Muller's ratchet (iFelsensteinl . Il974l ). 



In our study, the MuUer's-ratchet regime corresponds to the limit of high 
average fitness, a ^ 0. In this case, beneficial mutations play a negligible role, 
and the speed v of the wave is positive, < f < 1. The following derivation 
is valid if s <^ ?7, which condition ensures that the high-fitness tail of the 
distribution is long. Incidentally, the same condition implies a large half-width 
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of the wave. (Note that the two conditions differ in the case of adaptation, see 
the next section). We consider the case of moderately large population sizes, 
In < ?7/s (a more accurate condition is given in Appendix A.l), when clicks 
of the ratchet are sufficiently frequent, so that the fitness distribution never 
equilibrates and moves almost continuously. In the opposite case of very large 
population sizes,, ratchet clicks are rare, and the population can equilibrate 
between subsequent ratchet clicks. The ratchet rate is exponentially small 
and is estimated from the average time the fittest class takes to drift from its 



equilibrium value to an occupan cy of zero (iHaighl . 1 19781 : IStephan et al 
Gordo and Charlesworth . 2000a bl). 



1993 



3.1 Solitary wave approach in the ratchet limit 



Before we discuss the treatment of the stochastic edge in this case, we simplify 
the deterministic part of our theory, that is, the right hand side of Eq. (fT7|) . 

From the expression for m, Eq. (fT3|) . we find for a = 0: 

u = l/v. (20) 

With this expression, we obtain from Eq. ([1] 



xo = --[1-f ln(e/t;)]. (21) 
a 



Similarly, Eq. (fT7|) becomes 



In^ - + 1 



(22) 



As already stated, the approach is valid if the high-fitness tail is long, |xo| S> 1 
(Simplification 4), which reduces to the condition cr -C 1. Thus, our treatment 
does not predict Muller's ratchet in a broad parametric interval unless the 
selection coefficient is much smaller than the total mutation rate. In this case, 
the half-width of the wave is also large, and 0(0) in the left-hand side of 
Eq. (122]) is given by 

'/»(0) = ln,/^^^5^, (23) 
which represents Eq. ( TTOl) with a = 




We will now derive an expression for 0(xo) based on the stochastic treatment 
of the least-loaded class. Because a = in the Muller's ratchet limit, we set 
M(t) = in Eq. Further, from S = U + s{ko - k^^) = U{1 + axo), we 
obtain 

S = Uv\n{e/v), (24) 
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where we have made use of Eq. (121]) : we have S" > at all f < 1. After 
integrating Eq. (fT8|) . we find that the expected frequency of the least-loaded 
class decays as 

fkoit) = hMe-'\ (25) 

where t = is the point in time right after the previously least-loaded class 
ko — 1 has been lost. Eq. (I25p applies only while fko{t) is much higher than 
the stochastic threshold 1/{NS). When the stochastic threshold is reached, 
random genetic drift overcomes effects of selection, and the ko class is lost. 
Note that the stochastic threshold is defined within the accuracy of a numerical 
coefficient on the order of 1. 

The next step is to couple the dynamics of the least-loaded class to the move- 
ment of the wave. According to the approach outlined in Section [221 "we match 
the edge value 0(xo) to the average In fk^it) during the time in which the 
least-loaded class remains above the threshold condition. Using Eq. f|T9|) with 
/max = /fco(O), we obtain 



ln/,„(0) + ln — 



(26) 



As per the definition of f, the wave moves one notch in time tcUck = Vl^"^)- 
The time until the least-loaded class reaches the stochastic threshold condi- 
tion, tioss, follows from Eq. fl2^ as 

^ /fco(0)e-^*'°-. (27) 



SN 

After equating tciick and tioss, we find 



We now insert Eq. fl28l) into Eq. fl26|) and obtain with Eq. ([2i 

0(xo) = - \\y{NUv^l'^ ln(e/t;)] . (29) 

Because the stochastic threshold 1/(A^5') is an estimate defined up to a nu- 
merical coefficient not to exceed the accuracy of our calculation, we set the 
numerical coefficient inside of the logarithm to 1. The missing numerical coef- 
ficient introduces a small error, because we have assumed that the population 
size and, hence, the argument of the logarithm are large. 



Previously, iRouzine et al.l (120031 ) matched 0(xo) directly to the stochastic 
threshold ln[l/(A^S')], which leads to a slight underestimation of ^(xo). The 
only difference from the old treatment is an additional factor of w^^^ in the 
argument of the outer logarithm in Eq. (l29ll . 
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We neglected the stochastic effects acting on the next-loaded class, k = ko — 1, 
on the grounds that its size is larger than then the size of the least-loaded 
class, which is on the order or larger than the stochastic threshold; hence, the 
random drift term in the two-allele diffusion equation for the next-loaded class 
can be neglected as compared to the selection term. The average ratio of the 
two sizes can be estimated, in the deterministic fashion, from the logarithmic 
slope of the fitness distribution, as given by fko-i/ fko = exp[(/)'(xo)] = u = 1/v, 
Eq. fl20|) . Thus, in the typical case when v is not close to 1, the next-loaded 
class is, at least, several-fold larger than the least-loaded class. The error 
introduced by the approximation is, again, only a numeric factor at A^, Eq. 
(1291) . Computer simulation confirms the small numeric effect of the error (see 
below). 



3.2 Correction for discontinuity at k = ko 



In the derivation of the wave equation, we made the assumption that In fk+i — 
In fk can be approximated by din fk/dk (Simplification 4). This approximation 
is valid if the first derivative of In/^ does not change much from k to k + 1, 
that is, if In is approximately linear between k and k + 1. We can state this 
condition more formally by demanding that 



dlnfk 



dk 



> 



dk^ 



(30) 



With din fk/dk = 0', we can rewrite this condition as 

dx 

> 1 



(31) 



For f ~ 1 and |x| ~ |xo|, from Eq. f[T^ at a = and Eq. fl2Ul) . we estimate 
(f)'{x) ~ 0'(^o) = In('u) ~ 1, \dx/d(f)'\ ~ l/a. Thus, for a fitness class some- 
where in the middle of the tail and f ~ 1, the continuity condition Eq. (!3T!) is 
equivalent to a <C 1. Therefore, the tail must be long, |xo| ^ 1, Eq. (12T1) . as 
we already mentioned several times on intuitive grounds. However, near the 
fitness edge x = Xq, we have dx/dcp' = 0, because we defined Xq to be the 
minimum of x{(j)'). Therefore, the condition is violated very close to the edge. 
This effect is especially important at f ~ 1, where the condition on cr becomes 
more restrictive. Indeed, the tail length vanishes at f — >■ 1, Eq. fl2Tl) . 

Condition is trivially violated in a narrow vicinity of the wave center, 
X = 0, where 0'(O) = 0. Because the first derivative in the left-hand side of 
Eq. (!30l) is zero, a more appropriate condition of continuity in that region is 
that the second derivative is much larger than the third derivative, which is 
met at cr -C 1. 
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Right at the edge, In/fc grows faster with k than our continuity approximation 
allows for. The inflation of the In/^ values for k close to ko spreads inwards 
towards higher k values in a deterministic fashion (see Fig. H]). We can in- 
vestigate the effect of this perturbation by considering the discrete balance 
equations near the edge, 

^ = -Sho , (32) 
^ = -Sh + Ufk-i, k>ko + l, (33) 

with S = Uvln{e/v). These equations follow directly from Eq. ([3]) if we ap- 
proximate S with its value at the edge, as given by Eq. ( 12^ . As we show in 
the appendix, this set of differential equations has a solution under periodic 
initial conditions of the form /fc_i(0) = fk[l/{Uv)]. In other words, under the 
stationary process, the wave moves one notch in time l/{Uv), periodically 
repeating its shape near the edge. 

Note that Eqs. fl32l) and fl33l) apply only close to the edge, because we have 
assumed that S is constant in k. Far from the edge, we have to use instead 
a more general equation, which we solve in the continuous approximation as 
described above. 

Our continuous approximation predicts that the difference between In fk and 
In /fcp near the high-fitness edge. A; — /eq ^ |xo|, is given by the linear expression 
d\nfk/dk ^ (()'{xo){k — k^) = \n{l/v){k — k^), where the last equality follows 
from Eq. fl20|) . However, when we integrate the set of differential equations 
Eqs. f l32|) and fl33|) . and compare the values of \nfk{t) and lufk^it), for ex- 
ample, in the middle of one cycle, t = l/{2Uv), we find [Eq. flA.18|) in the 
appendix] 



• (34) 



In other words, on the right hand side of Eq. (1221) . which corresponds to our 
estimate of the difference 0(0) — 0(xo) from the continuous approximation, we 
have to add a correction term of the form ln[(2/-ye)(|xo| +5/6)] to account for 
the fact that, given (pixo), the value of 0(0) is slightly larger than what our 
continuous approximation predicts. Note that this correction term does not 
depend on any model parameters. With xq given by Eq. fl2T|) . the correction 
term becomes 

ln\^(l-v\n- + —)] -Ina. (35) 
.^/e\ V 6 /. 



Combining the correction term with Eqs. fl22|) . (l23l) . and (129!) . and dropping 
all the numerical constants multiplying N inside the logarithm, we arrive at 
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our final result 



a\n{NUa 



3/2N 



- In - + 1 

2 V V 



crln 



ln(e/f ) 



1 — vl — t'ln(e/f) + 5(j/6 

(36) 

This expression relates the selection strength a = s/U, population size N, and 
mutation rate U to the normalized ratchet rate, v = {1/U)dk^^/ dt. We can 
evaluate this expression in two ways. First, we can assume the point of view 
that Eq. fl36l) describes as a function of v and model parameters. Thus, 
we can directly plot N over the entire range of (0 < f < 1). The solid 
lines in Fig. [5] were generated in this way. Alternatively, we can solve Eq. fl36l) 
numerically for f at a given value of A^. 



We note that, at very small a and large A^, the second term in the right- 
hand side of Eq. fl36l) . originating from the stochastic edge treatment and the 
correction for discontinuity, can be neglected. In this limit, the substitution 
rate normalized to the mutation rate is expressed in terms of a single composite 
parameter ahi^NUa^^"^). When this parameter is equal to 1, the ratchet rate, 
i n this limit, becomes ex a ctly ze ro. In fact, the ratchet speed is predicted 
( Gordo and Charleswortlil . 2000a ) to be finite albeit exponentially small at 
a hi{NU cr^/^) > 1, when loss of the fittest genotype occurs so infrequently that 
the solitary wave approach breaks down (see below). At realistically small cr, 
the second term in Eq. fl36l) is important, as we show below by simulation, 
especially near f = 1, where the first term vanishes as (1 — w )^, and at very 
small f , where the second term smears out the critical point a \n.[NUa^^'^) = 1. 



The selection coefficient s enters Eq. fl36l) through its rescaled version a = 
s/U. Likewise, the ratchet speed —V = dk^v/dt enters in the rescaled form, 
V = —V/U. On first glance, this scaling looks unusual in comparison to the 
standard diffusion limit. In this limit, all result s are expresse d in terms of 
time rescaled with the population size, t' = t/N flEwensl . 1200J), and rescaled 
mutation rate and selection coefficient, U' = NU and s' = Ns. However, it 
is straightforward to express Eq. (!36l) in terms of the standard diffusion limit 
scaling. Note that a rescaling of time with A^ means that we also have to rescale 
the ratchet rate, V = NV, and note further that v = —V/U = —V'/U' and 
a = s/U = s' /U'. Hence, if we replace the factor NU on the left-hand side of 
Eq. 0361) with [/', Eq. 0361) is given entirely in terms of the rescaled quantities 
of the standard diffusion limit, V, s' and U'. 



3.3 Comparison with simulation results and other studies 



We carried out simulations of MuUer's ratchet as described (IRouzine et al. 



20031 ). and compared the measured ratchet speed to the speed predicted by 
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Eq. (136|) (Fig. [5]). As shown previously (iRouzine et al.l . l2003l ). the travehng- 
wave theory leads to accurate predictions of the ratchet rate over a wide range 
of population sizes. Even though the theory is derived under the assumption 
that is large, we find that our prediction for the ratchet rate is accurate 
already for population sizes as small as = 10, and continues to be accurate 
throughout the entire range of biologically reasonable population sizes, until 
a\n(NUa^^'^) becomes larg er than 1. Thu s, our results connect expressions for 
the ratchet rate derived by iLandd (Il998l ) or iGordo and CharlesworthI (l2000al ) 
that work, respectively, for either very small or very large population sizes. 



The transition to the regime of lar ge A^ occurs when the size of the le ast-loaded 
class is large, N exp{—l/a) ^ 1 (iGordo and CharlesworthI . l2000al ) (Fig. 5c). 
Then, clicks of the ratchet are rare, and the population is at equilibrium most 
of time. In contrast, in the intermediate interval of A^ we study, the least-loaded 
class is not sufficiently large and is frequently lost. Therefore, a population 
does not have time to equilibrate between clicks. The non-equilibrium nature 
of the ratchet is witnessed by an almost constant, non-zero effective selection 
coefficient of the least-loaded class, S > and the fact that the width of the 
distribution is smaller than the equilibrium wid th, ■\^/V ar[fc] < \ju/s, Eq. f|TT]) . 
The transition to the regime of small A^ (Lande, 19981 ) occurs at A^s ~ 1, when 
fixation events of deleterious mutations at different sites are separated in time, 
and one-site theory applies. 



We found that the term correcting for the discontinuity at k = k^, Eq. fl35|) . 
substantially improved the agreement between the numerical simulations and 
the analytical prediction of the ratchet speed for a large interval of population 
sizes (Fig. [5D). While the traveling- wave approximation (based on continuous 
treatment of fitness classes) combined with the stochastic treatment of the 
highest-fitness class is good enough to make useful predictions about the speed 
of MuUer's ratchet, the discreteness correction is necessary to achieve high 
prediction accuracy. The correction term becomes irrelevant only for very large 
population sizes, where the ratchet speed is dominated by the first term on 
the right-hand side of Eq. ( |36l) (Fig. [5]C, dashed lines). 



Note that iRouzine et al.l (120031 ) also used a correction term to account for 
the discontinuity at k = k^. However, their term was obtained by fitting an 
interpolation formula to the numerical solution of Eqs. (132|) and (l33l) . By 
contrast, here we derived the correction term analytically, as the asymptotic 
behavior near the high-fitness edge of the wave. 
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4 Speed of adaptation 



If a population experiences a sudden change in environmental conditions or is 
introduced into a new environment, it will initially be ill-adapted. Over time, 
the population will accumulate beneficial mutations until a new mutation- 
selection balance is reached. The process of accumulating beneficial mutations 
is called positive adaptation, and the rate at which fitness changes over time 
is the speed of adaptation. 

In our model, in the case of positive adaptation the speed of the wave is 
negative, f < 0, because the wave moves towards smaller mutation numbers k. 
Furthermore, we make the assumption that we are far from mutation-selection 
balance, |f| ^ 1, i.e., \dks_-y/dt\ ^ U. In this regime, the rate of deleterious 
mutations is a small correction that can be neglected, and only the beneficial 
mutation rate affects our results. Therefore, it is more convenient to introduce, 
instead of v, the average substitution rate of beneficial mutations V, as given 
by 

V = = -Uv, (37) 

dt ' ^ ^ 

and the beneficial mutation rate f/b = o^U. The derivation that follows applies 
if s is much larger than Ub, and the population size is sufficiently large, so that 
V ^ s/ ln{s/Ub). The latter condition ensures that the high-fitness tail of the 
distribution is long and the whole approach is valid. Within this validity region, 
we will consider two cases: moderate substitution rates, s/ \n{s/Ub) -C <^ s, 
where the half-width of the fitness distribution is small, and relatively high 
substitution rates, V s, where the distribution is broad. 



4-1 Solitary wave approach in the adaptation limit 



We expand u, xq, (f>'(xo) [Eqs. (fT3|) and (fT^ ] and the right-hand side of Eq. (fTTIl 
for large negative v (dropping all terms small compared to |z;|), and find 

V V 
u = —, (xo) = Inw = In — , (38) 

V / V \ 

and 

W)-*(^o) = |(to^^ + l). (40) 



The approach rests on smoothness of In /fc. We obtain the continuity condition 
for In/fc as before from Eq. (I3T1) . using 0' ~ Inw = ln(y^/t/b) and Eq. (fT2|) . In 
Eq. (JT2I) . we keep only the third term in brackets: the first term is small, and 
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the second term is important only near the edge. The resulting condition, V ^ 
s/ \ia(y/Ub), is equivalent to |xo| ^ 1. Just as in the case of Muller's ratchet, 
the high-fitness tail must be long for our approach to work (Simplification 4). 

Unlike in the ratchet case, even though the high-fitness tail is long, the half- 
width of the wave is not necessarily large. Large adaptation rates (large popu- 
lation sizes) correspond to a broad distribution, Var[/c] 3> 1, and intermediate 
adaptation rates (intermediate population sizes) correspond to a narrow dis- 
tribution, Var[/c] <^ 1. For the two respective cases, from Eqs. (|TOl) and ([7]) 
with > [/ (f < 0, |f I > 1), we obtain 



0(O) = -iln^, (41) 
2 s 

0(0) = 0, s/ ln(r/t/b) < V < s. (42) 



Again, we have to couple the deterministic wave equation to the stochastic 
behavior of the least-loaded class following the approach in Section 2.7. The 
condition that only the best-fit class should be treated stochastically reads 
V 3> Ub, which is equivalent to s|xo| ^ Ub (Simplification 7 and the end of 
this subsection). 

Far above the stochastic threshold, the dynamics of the least-loaded class 
follows Eq. (fT8!) with the effective selection coefficient S and the mutation 
term M given by 

S=U + sxo^-Vln{V/U^), M = U^fu,+i. (43) 

Thus, the effective selection coefficient is negative, causing exponential expan- 
sion of the class. Unlike in the ratchet case, beneficial mutation events cannot 
be neglected: Their role is to add more and more clones to the class, result- 
ing in a time- dependent prefactor influencing the growth of the class on large 
time scales. However, the logarithmic derivative of that growth is primarily 
determined by selection alone. To demonstrate the validity of this statement, 
we evaluate the relative magnitude of the two terms M and SfkQ in Eq. ffTSj) . 
From Eq. fH3|) . we have 

According to our continuous approximation, ln(/fc„+i//fc3) = d\n. f^^/ dkQ = 
(p'ixo). Thus, with 0'(xo) = MV/Uy,) [Eq. ([38])], we find 
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Therefore, we can safely neglect the mutation term M in the log-derivative 
of fko in time and approximate it with IS"!, which fact will be used in the 
derivation below. 



The dynamics of the current least-loaded class fkoif) above the stochastic 
threshold l/(iV|S'|) represents a periodic saw-shaped dependence (Fig. 6). 
When beneficial mutations occurring in the least-loaded class generate a new 
least-loaded class with a size on the order of the stochastic threshold, with 
probability on the order of 1, the new class will survive random drift and 
be further amplified by selection. We refer to this event as "a class is es- 
tablished". After a class is established, the wave moves one notch in k. By 
definition, the time period between establishment of two consecutive classes 
is given by tperiod = 1/^- Below we estimate the size of the best-fit class 
/u^ax = /fco (^period) at the moment it gives rise to a new established class. 

The total number of mutational opportunities (number of genomes that may 
potentially generate a beneficial mutation during one click) is multiplied 
by the time integral of fkoif) over one period, which is A^/max/|5'|. In other 
words, because of the exponential growth of fkoif), most of the mutational 
opportunities will arise in a short time interval of length l/IS*! around the time 
when fkoif) has come close to the value /max (Fig. ED- We note that that time 
interval is much shorter than the time of one click, '\-/\S\ = [l/V) / ln{V/U]^) -C 
1/V. We obtain the average number of new least-loaded genomes created 
during one period by multiplying the number of mutational opportunities by 
the beneficial mutation rate t/b- Fin ally, each of the s e benefic i al mu tations have 



a probability 2|S'| to survive drift (IHaldand . 119271 : iKimural . Il962l ). Therefore 



the total number of beneficial mutations that arise and go to fixation during 
one period is 2|S'|f/b-A/max/|'S'| = 2t/b-/V/inax- We find the desired value of /max 
from the condition that, on the average, one new fitness class is established, 
which yields 

/™.«^. (46) 

Based on Fig. 6 and Eq.( H6l) . we can also estimate the time of one click 1/V 
in terms of S" = sxq, as given by 

which is similar to the expression 1/V = {l/\S\)ln[V/{eUb)] obtained from 
the deterministic part of our derivation, Eq. fl5I?l) . The difference is in the 
logarithmic factor 15*1/1^ ~ ln{V/Ui,) in the argument of the large logarithm in 
Eq. (H7!) . At the moment, we do not understand the discrepancy. The fact that 
V as a function of |a;n| can be obtained from cons idering either the stochastic 



edge (jPesai and Fished . 120071 : iBrunet et al.l . 120071 ) or the deterministic " bulk" 



is certainly not trivial and deserves further inquiry. 
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The above estimate of /max operates with fitness classes and does not depend 
on the particular clone composition of a class. Still, for the sake of clarity, we 
would like to make a comment on the clone composition of the least-loaded 
class. The estimate of /max (defined within the accuracy of a numeric factor on 
the order of 1) corresponds to the short interval of time ~ ^/\S\ when a new 
l east-loaded class is in the vicinity of the stochastic threshold, fk^ ~ 1/(A^|S'|) 



( iKimura and Ohtal . Il973l : iBartonl . Il995l : iRouzine et al.l . l200ll ). At that time 



the new class is comprised of a small number of clones (most likely, one or two). 
Later, as the next-loaded class increases nearly exponentially in time above 
/max, it generates additional clones joining the least-loaded class. Because the 
additional clones cross the stochastic threshold later than the first clone(s), 
they are consecutively smaller in size. As a result, the growing class consists 
of an increasing number of clones with decreasing relative sizes. However, the 
eventual clone composition of the best-fit class does not influence the estimate 
of /max, because it is formed at later times. 

Now, following our general approach, Eq. (fT9ll . we match 0(xo) to the average 
between the minimum and the maximum value of fk{t) under deterministic 
growth (Fig. [H]), 



0(Xo) 



1 



2 
-In 



In /max + In 



1 



\S\N 



NJVU^\n{V/U^] 



(4J 



In comparison to the previous result by iRouzine et al.l (120031 ) who matched 
(f){xQ) directly to the stochastic threshold, the result found with the improved 

treatment of the stochastic edge differs by a factor of \J (V/U)^)/ \n^{V/U\^) 
multiplying A^. 



As in the case of Muller's ratchet, we can calculate a correction to the contin- 
uous approach that accounts for the deterministic perturbation of the wave 
shape caused by the discreteness of fitness class near k = ko. The calcula- 
tion is similar to the one for Muller's ratchet, and the details are given in the 
appendix. The final expression for this correction term, Eq. (lA.35p . reads 



In 



2{k -ko) + l + 



1 



ln(y/f/b 



(49) 



We now replace k — kohj \xo\, and neglect the term +1 in the above expression 
and the term —Vs in |xo| [Eq. fl39|) ]. using the strong inequality ln{V/Uh) ^ 1. 
Then, the correction term becomes 



'V V 
In ( — In — - 

s Ub 



(50) 
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Inserting Eqs. (H8l) and (HT]) into Eq. (HOl) . adding the correction term (150!) 
to the right-hand side of Eq. (HOl) . and dropping all numerical constants mul- 
tiplying inside the large logarithm, we arrive at our final result for large 

V . , ... ^^^^ 



InA^ 



2s V ef/b 



+ 1 - In , 



^\/2in(\//?7b) 

For intermediate V , 0(0) = from Eq. fH2|) . and the final result reads 



InA^^ — In^ + 1 

2s V ef/b 



In, 



S2f/b 



\ l^ln(\//t/b 



s/ln(y/f/b) < y < s. (52) 



The difference between these two results is a factor of \JV/s multiplying the 
large population size A^. As we found in simulations, the numeric effect of 
that difference is quite modest in a very broad parameter range. As in the 
case of Muller's ratchet, we can evaluate expressions (jSTl) and ( !52l) either by 
calculating as a function of V , or by numerically solving for V at a, given 
value of A^. 

As in the case of Muller's ratchet, we can express Eqs. flSTj) and fl52|) in terms of 
variables encountered in the standard diffusion limit, V = NV, s' = Ns, and 
= NUb. First, note that V/s = V'/s' and V/Ui, = V /U^ Second, after 
subtracting InA^ from both sides of Eqs. ( 15T|) and ( l52i) . we obtain exactly 



the diffusion limit scaling for the terms Inys'^t/b/^^ and In \J s'^Uh/V on the 
right-hand sides of these two equations. 

For extremely large A^, we can obtain an explicit expression for V from Eq. (!5T!) 
through iteration. In zero approximation, we have V ^ s. We insert this value 
into the logarithms in Eq. fl5ip . substitute the resulting expression again into 
Eq. fl5T]) . and then neglect all numeric constants and terms of the form ln(s/f/b) 
inside of large logarithms. We also neglect InA^ when it multiplies A^ in the 
argument of a logarithm (because A^ is assumed to be very large). We find, to 
the first order, 

2s In^N^sU^) 



V 



ln2[(s/f/b)ln(Arv/^)] 



(53) 



The m ain difference from the previous asymptotic result of iRouzine et al 



( 120031 ) is the factor -y/ s Uh instead of Uu multiplying A^. [Note that in the cor- 
responding equation in (IRouzine et al.l . l2003l ). Eq. (33) in the Supplementary 
Text, the factor multiplying A^ was accidentally omitted.] 



We assumed that only the best-fit class should be treated stochastically and 
that the next-loaded class, k = — 1, can be treated deterministically. The 
validity condition can be found from the requirement that the average size 
of the next-best class is much larger than the size of the best-fit class, which 
is either on the order of or larger than then stochastic threshold. Indeed, in 
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this case the random drift term in the diffusion equation for the next-loaded 
class can be neglected as compared to the selection term. The requirement 
also ensures that, when a best-fit class gives rise to a new best-fit class, the 
former is much higher than the stochastic threshold. The validity condition 
is \^ ^ Ub, which is equivalent to s|a;o| ^ Ub (Simplification 7). Because 
we already assumed the strong inequality V Ub when deriving Eqs. flSTj) . 
(1521) . and fl53l) . the next-loaded class can safely be considered deterministic in 
the entire region where the results of this section apply. Thus, unlike in the 
case of Muller's ratchet, where the approximation of a single stochastic class 
introduces an error corresponding to a numeric factor multiplying A^, in the 
case of adaption, the approximation is asymptotically accurate. 



4.2 Comparison with simulation results for the adaptation rate 



We c arried out simulations of adaptive evolution as described (IRouzine et al. 
20031 ). and compared the measured speed of adaptation to the speed pre- 
dicted by Eq. (!5T!) in a wide range of parameter settings (Fig. [7JA--C). Because 
deleterious mutations can be neglected in the limit V ^ U we consider, the 
simulation results we present here were obta i ned in the absence of deleterious 
mutations, U = U\^, whereas iRouzine et al.l (120031 ) considered beneficial and 
deleterious mutations at the same time. 



Without the discreteness correction, Eq. (jUj), the analytic prediction of the 
wave speed generally follows the simulation results fairly well in a broad pa- 
rameter range. The analytic result consistently overestimates the simulation 
result by 15 to 25% (Fig. [7]^ to C, dashed line). When we include the dis- 
creteness correction, the accuracy improves significantly across all parameter 
settings. In general, the accuracy becomes higher as f/b and s decrease. (If 
deleterious mutations are present, U > Uh, the predictions from traveling 
wave theory are at least as accurat e — if not more so — a s in the absence of 
deleterious mutations; see Fig. 2f in IRouzine et al.l (120031 ) ). 



Fo r comparison, we s how a nalytic results for the substitution rate obtained 
by iDesai and Fisherl (120071 ) who used a different approach [Eqs. (36), (38), 
and (39) in the cited work]. Their method applies at small adaptation rates 
where the lead |a;o| is only moderately large (see our Discussion) . In its validity 
range, the predic t ion is very similar to ours (Fig. [7p, dotted line). Note that 
Desai and Fisherl (120071 ) in their Fig. 5 compare with simulation not the actual 
analytic result, but its simplified version, Eq. (41), which accidentally gives 
a higher accuracy than the actual result. When the lead is large and the 
approach breaks down, their analytic prediction is far from simulation results 
(Fig.ElB, dotted line). 
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4-3 Testing stochastic edge treatment: Simulation of a two-class model 



In order to validate our results and to get further insight into the dynamics 
of the stochastic best-fit class, we carried out a computer simulation for a 
simplified model i ncluding only two fitne ss classes, the best- fit class, and the 



second-best class (jPesai and Fisherl . 120071 ) . Only the best-fit class was treated 



stochastically. The aim was to confirm the analytic estimate of the size of the 
second-best class when the new best class is established, Eq. fH6l) . The way 
it was derived, the estimate is expected to be robust with respect to such a 
model simplification. Note that the two-class model cannot be used to test 
other intermediate results of this section. 

According to the two-class model, we assume that the frequency of the second- 
best class grows as 

fko-iit) = TT^e^^l^^l-i)*, (54) 
Ns\xo\ 

where l/(s|a;o|) is the characteristic size where selection prevails over genetic 
drift. (In other words, we ignore stochastic effects and mutants coming from 
less-fit classes.) According to the main model, the average frequency of the 
best class grows as 

h,{t + 1) = e^l^«l/fc„(t) + ^b/fco-iW = e^'"«l/fco(t) + -i^e^d-ol-D*. (55) 

i V S |Xo I 

In our pseudorandom simulation, we calculated the actual size of the best class 
^fko [t) over a long time scale using Poisson statistics with the average given 
by Eq. fISS]) . Then, to find the time delay between the growth of two classes, we 
extrapolated /a:o(^) back from large times t 3> l/s|xo| when the best-fit class 
was large enough to be treated deterministically, N/k^ ^ l/s|xo|, as follows. 

We seek a solution of Eq. (!55|l in the form fko{t) = y4(t)e*'^°'*. Inserting this 
expression into Eq. (l55l) and solving the equation in discrete time with respect 
to A(t), we obtain 



AiO) 



s|Xo| 1 — 6' 



St 



(56) 



where A{0) is an arbitrary constant. Strictly speaking, A{0) is random and 
has to be found from the stochastic edge simulation. 

Instead of ^(0), we introduce a more transparent parameter, the time delay 
between establishment of two consecutive best-fit classes r, as given by the 
periodicity condition 

A.(r)=A.-.(0) = ^. (57) 
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From Eq. (156|) . the parameter r is related to ^(0) as given by 



A{0) 



S\Xo\ 



(58) 



In terms of r, the growth curve of the best-fit class, Eq. ( !56l) . can be written 

as 



Ns\xo\ 



os\xo\t 



^-s\xo\r ^ lJ^Q-s\xo\ 



~st 



1-e- 



(59) 



We obtain r by measuring fkoit) in simulation at some large time t, such that 
t ^ l/s|xo|, and then solving Eq. ( l59i) numerically for r. We checked that 
changing the sampling time t within the range r to lOr has a small effect on 
the result for r, which confirms the vadity of our back-extrapolation procedure. 



Finally, in order to test the accuracy of our analytic estimate, Eq. fH6l) . we 
calculate the constant C in the stochastic edge simulation, as given by 



C = NU^fk,-iiT) = NU^f^ 



(60) 



We averaged the value of log C over 500 simulation runs. The result is shown in 
Fig. Ofor various values of |xo|, s, and Ub- In principle, the analytic result that 
C = const ^ 2 is supposed to be asymptotically correct at |xo| ^ 1, s|xo| ^ 1. 
In the parameter region |xo| > 5 and s|a;o| < 0.2, the value of C varies between 
1.5 and 2.5 . (Note that even though Eq. (!60|) seems to imply that C depends 
on A^, this factor actually cancels.) Because C is only a numeric factor at N, 
and is large and enters a logarithm in the expression for the substitution 
rate, such variation of C has a small effect on the final substitution rate and 
thus confirms the validity of our analytic estimate, Eq. fH6|) . 



It is important that, due to constant influx of new beneficial mutants from the 
second-best class, the best-fit class size does not grow exactly exponentially, 
Eq. fl59p . We checked that replacing Eq. (1591) with an exact exponential in 
the back-extrapolation procedure seriously affects the result for r and makes 
it depend on the (arbitrarily chosen) sa mpling time t (re s ults n ot shown). 



This facts explains why the approach by iDesai and Fisherl (120071 ). who used 



the exponential back-extrapolation to estimate r, works only at very small 



adapt ation rates when |xo| ~ 1 [Fig. [7]) (see Discussion and iBrunet et al 



(120071)]. 



4-4 Crossing over to the single site model 



We expect intuitively that, for extremely large population sizes, the speed 
of adaptation should cross over to its deterministic, infinite-population-size 
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behavior. In the deterministic hmit, the process of creation and fixation of rare 
mutants does not affect the speed of adaptation, because all possible mutants 
are instantaneously present. Instead, the speed of adaptation depends simply 
on the growth rate of the mutants at different sites. As a consequence, in the 
mulitplicative fitness model, linkage between sites becomes irrelevant, and we 
can describe the system behavior with the single site model, 

f = s/(l-/)+/ib(l-/), (61) 

where / is the frequency of the beneficial allele at the given site, and we 
have assumed that deleterious mutations do not occur. The time it takes for 
a beneficial clone to grow from size to size /o is T = ln[(s/o + /ib)/(Aib — 
/o/^b)]/(s + /ib)- Setting /o = 1/2 and assuming s ^ /ib, we find that the 
half-time of reversion, in which the beneficial clone grows to 50% presence, is 

Ti/2 ^ (l/s) ln(s//ib) . (62) 

The same equation applies to a population of genomes that contain multiple 
sites which are independent and unlinked. Thus, if at some time point there 
are fcav deleterious alleles fixed in the population, then Eq. fl^ gives the time 
interval until, on average, all members of the population have reverted half of 
these deleterious alleles. 

To test our intuitive expectation, we will now compare this expression to the 
corresponding expression from traveling wave theory. For a wave centered at 
position fcav, the time to revert half of the deleterious mutations is 

- ^ . (63) 

After inserting Eq. fl53l) into Eq. fl63l) and keeping only the leading terms in 
A^, we find that at InA^ ~ kg^^ln{s/ fi\j), the half-time of reversion crosses over 
to the single-site expression Eq. flB21) (under the assumption that s ^ fi^). The 
reason for this behavior is that once InA^ grows to fcav lii(s//ib), the left edge 
of the wave reaches the mutation- free genome. At this point, the assumption 
ksLv ^ ^av — ^0 is violated, and the traveling-wave approach ceases to be valid. 
For population sizes exceeding fcav ln(s/yUb), the system behavior is essentially 
independent of population size, and linkage does not slow down the speed of 
adaptation. 



5 Discussion 

We have derived accurate expressions for the speed of MuUer's ratchet and 
the speed of adaptation in a finite, asexual population. Our expressions are 
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numerically valid in a wide paramet er range. The main di fference between the 
work presented here and the work of iRouzine et al.l (120031 ) is an improved and 
more intuitive treatment of the stochastic egde, and the analytic derivation 
of correction terms for the discontiuity at k = kg for both MuUer's ratchet 
and the speed of adaptation. The latter correction terms lead to significantly 
improved numerical accuracy in a wide range of population sizes. 



5.1 Muller's ratchet 



A qua ntitative treatment of Muller's ratchet was first attempted by iHaigh 
(119781 ). who studied primarily the case in which ratchet clicks are rare, so 
that the population can equilibrate between subsequent ratchet chcks. Since 
then, numerous studies have derived iniproved estimates of the ratchet rate 
(IPamilo et aLl.ll987l:ISteDhan et al. |. [l993l: lGesslerl.ll995l:lHiggs and Woodcqcki . 
19951 : IPriigel-Bennettl . Il997l : iLandel . Il998l : ICordo and Charlesworthl . boOOaP bF). 



Most approaches to calculate the ratchet rate work either in the case N < 1/s, 
when fixation events of deleterious alleles at different sites are well separated 
in time and the one-site theory applies, or in the limit of very large population 
sizes, when the deterministic expectation for the size of the fittest class no = 
Nexp{—U/s) is much larger than 1. In the former case, the ratchet rate is 
estimated as the rate at which deleterious mutations enter the pop ulation , 
multiplied with the probability of fixation of deleterious mutations (ILandd . 
19981 ). In the latter case, the ratchet rate is estimated from the average time 
the fittest class t akes to drift from its equilibrium value t o an occupancy of zero 



(IStephan et al.l . Il993l : ICordo and Charlesworthl . l2000al lbh. A third approach 



most closely related to the present work, is a quantitative genetics approach 
whereby the fitness distribution is described by its mean, variance, and higher 
moments, a nd equations are der i ved that describe the change of these moments 



over time (jPamilo et al.l . 119871 : iHiggs and Woodcockl . Il995l : iPriigel-Bennett 



19971 ) . A disadvantage of this approach is that it generally requires an arbitrary 
cutoff condition to truncate the moment expansion, and that the equations 
become quickly untractable if higher moments are included. 

The rapid ratchet region described in the present wor k corresponds, mostly, 
to no smaller than 1 (Fig. 5c). For small no, the work by lGesslerl (|l995l ) is gen- 
erally regarded as giving accur ate estimates for the ratchet rate. However, it 
is important to emphasize that iGesslerl (119951 ) did not actually derive a closed 
form expression for the ratchet rate. A key parameter in his result needs to 
be determined numerically by iteration. Moreover, several of his expressions 
were not derived from first principles, but chosen on the basis that they pro- 
vided a good fit to simulation results. In contrast, our expression Eq. (136|) was 
derived from first principles and is a simple, closed-form expression that can 
be plotted easily, if we interpret it as describing iV as a function of v (rather 
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than f as a function of A^). 



The travehng wave approach also provides a convenient framework to study 
the ratchet speed in the presence of beneficial mutations, or, conversely, the 
speed of adaptation in the presence of deleterious mutations. All that is re- 
quired is to keep the full expression 0(0) — 4>{xo), Eq. flTTl) . instead of its re- 
spective asymptotics. While we did not discuss these cases here in detail, the 
correspond ing calculation s are s traightforward and lead to closed-form expres- 
sions [see (IRouzine et all l2003l ). Supplementary Text]. In comparison, other 
approaches may lead to similarly accurat e expressions, but us u ally d o not 
yield closed-form expressions. For example, iBachtrog and Gordd (120041 ) stud- 
ied the effect of beneficial mutations in the presence of Muller's ratchet. They 
used a Poisson approximation for the steady-state distributions of mutations 
in populations undergoing Muller's ratchet, explicitly modeled the behavior 
of beneficial mutations arising in all possible mutation classes, and used the 
result ing fixation rates a s corrections for the ratchet rate derived by iGessler 
( 1995 ) for no < 1 and by Gordo and Charlesworth ( 2000a| jbl) for no > 1. This 
approach leads to accurate results in a wide parameter range, but the final 
equations contain sums over all mutation classes, and these sums cannot be 
evaluated analytically. 



Finally, with the traveling wave approach, it is also possible to consider the 
general case when both beneficial and deleterious mutations are equally im- 
portant, including steady state at finite iV, v = in Eq. (ITTI) . which differs 
from equilibrium at infinite (IRouzine et al.l . 120031 ). However, to achieve a 
high accuracy in this case requires some additional work on the stochastic 
edge condition. 



5.2 Adaptation 



In an asexual population, the speed of adaptation does not grow linearly 
with for large A^, because beneficial mutations that arise contemporane- 
ously in independent genetic backgrounds cannot recombine, and, therefore. 



tually go to fixation ( 


Fishei . 


1930; 


MuUei 




1932 


Hill and Robertson. 


1966; 


Mavnard Smith. 


1971). 



Crow and Kimura. 1965 



One reason why a benefi- 
cial mutation may not go to fixation is interference with another mutation 
with a larger beneficial effect that ar ises either shortly before or shortly af- 
ter the first mutation (IBartonl . Il995l ). Recently, a theory that predicts the 
evolutionary rate in large asexual populations in the presence of this interfer- 
ence process has received considerable attention (clonal interference-theory, 
Gerrish and Lenski . 1998 ; Orr . 2000l ; Wilke, 2004). However, interference does 



not necessarily occur in the presence of a single alternative mutation. In fact. 
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multiple beneficial mutations arising in the already existing clones — that 
have moderate individual effect but large combined effect — can rescue clones 
with a smaller beneficial effect and partly resolve the clonal interference. The 
clonal interference theory neglects this effect and predicts that the substitution 
rate rapidly approaches a constant with incre asing N (und er the assumption 
of exponentially distributed beneficial effects, IWilkd . 120041 ). and that further 
increases in the speed of adpatation can only come from fixation of muta- 
tions with increasingly larger beneficial effects. This prediction is, most likely, 
not entirely correct, because an increase in population size also increases the 
chance that multiple mutations occur in the same clone. In general, we expect 
the clonal interference theory to have a very limited range of applicability. If 
beneficial mutations are rare, interference is irrelevant to the speed of adapta- 
tion, and V is proportional to A^t/b- On the other hand, if the population size 
is large and beneficial mutations are frequent, then multiple mutations within 
a single clone should be frequent as well, and we expect the current clonal 
interference theory to underestimate the speed of adaptation in this regime. 



In contrast to the clonal interference approach, the traveling wave approach 
considers multiple mutations. However, the present model does not consider 
interference from mutations with a larger beneficial effect, because all mu- 
tations are assumed to have the same selection coefficient. (Note that, as a 
consequence, the substitution rate and the speed of adaptation are essentially 
equivalent in our model, as they differ only by a constant factor s, whereas 
they are distinct quantities if there is variation in the selection coefficient, as 
is the case in the clonal interference models.) Because of these differences in 
model assumptions, we cannot make a quantitative comparison between the 
two theories. However, we can observe an important qualitative difference: in 
the traveling-wave theory, the substitution rate keeps slowly increasing with 
population size until extremely large population sizes. 



For very large population sizes, such that In ~ fcav ln(s/yUb), the substitution 
rate becomes on the order of sfcav At this point, the tail of the wave reaches 
the best possible geri ome, k = 0, and the traveling wave theory breaks down 
(IRouzine et al.l . l2003l ). At higher population sizes, the one- locus model applies, 
and the value of the substitution rate saturates at the infinite-size value. Thus, 
we predict a continuous transition from the finite-population case in which 
linkage slows down the speed of adaptation to the infinite-population case in 
which l inkage has no ef f ect on the speed of adaptation in the multi plicative 
model (IMaynard Smithl . Il968l ). By contrast, iMaynard SmithI (Il97ll ) argued 
that a finite asexual population is always affected by linkage, regardless of its 
size. We must mention, however, that, for reasonable parameter settings, the 
population size at which the crossing over to the fully deterministic behavior 
occurs is unbiologically large. For example, assuming k^v = 20 and s/fih = 
1000, we find that A^ has to be on the order of 10^°, and this number grows 
rapidly for larger /c^v 
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It is important to stress that the speed of adaptation is, in general, not gov- 
erned by the total supply of beneficial mutations, A^f/b, because N and en- 
ter the expressions for the speed of adaptation indepe ndently and in d i fferen t 
functional forms. This fact had already been noted by lMaynard Smith! (jl97ll ). 
and its implications for experimental tests of the speed of adap ation have be en 
discussed recently in the context of clonal interference theory (jWilkd . 12004 ) . 



Recently, iDesai and Fisherl (120071 ) have proposed an alternative method to 
derive, for the same initial model, the speed of adaptation in a large asexual 
population in the absence of deleterious mutati ons. We discuss this method 
in detail and extend its validity range elsewhere (IBrunet et al.l . 120071 ). Briefly, 
these authors replace the full population model with a simplified model in- 
cluding only two fitness classes, the least-loaded class and the next-loaded 
class. The next-loaded class is assumed to grow exactly exponentially, as if 
under selection alone. The click time 1/V is determined by extrapolating the 
growth of the least-loaded class back from infinite time. As in our work, only 
the least-loaded class is treated stochastically. 



The main validity conditions for this approach follows. The next-loaded 

class can be treated deterministically if s\xo\ ^ Ub, which is equivalent to 
V ^ t/fe, the sa me condition that we u se in our work. (Note that inequality 
s given bv lDesai and Fishei] fl2007h was obtained for |xo| ~ 1; for general 
|xo|, the condition is as we state.) The back-extrapolation procedure is valid 
at moderate populat ion sizes or adaptation rates, such that In |xo| -C ln(s/f/b) 
(IBrunet et al.l . 120071 ). A sufficient validity condition is |xo| <^ \n.{s/Ub), which 
is equivalent to V ^ s (note that the cited paper contains a typo in that 
condition; M. Desai, pers. comm.). The accuracy of the replacement of the full 
population model with the two-class model and of the actual dynamics of the 
next-loaded class with an exponential is less clear. New beneficial mutations 
into the class create a time-dependent prefactor at the exponential. Yet, it is 
quite possible, that at moderate |a;o| or V where the cited approach is designed 
to apply, the approximation works well. We tested the num eric accuracy of 
the substitution rate predicted by iDesai and Fished (120071 ) with the use of 
Monte-Carlo simulation of the full population model in different parameter 
regions. Within its validity range, their result agrees with simulation within 
10-20% and is very similar to our result (Fig. 7C). [Note that the extremely 
high numeric accuracy shown in Fig. 5 of the cited work is due to replacement 
of the actual final result, Eq. (39), with its asymptotics for very large A^, Eq. 
(40), which does not apply at such moderate A^.] 



In their Discus sion section, iDesai and Fisherl (120071 ) state that the traveling 
wave approach Rouzine et al. ( 20031 ) requires that the fitness distribution is 
smooth in mutational load and, hence, broad, which takes place at large sub- 
stitution rates, V ^ s. Proposing some estimates of parameter values of Ub 
and s typical for yeast, they argue that this regime corresponds to huge pop- 
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ulation sizes and conclude that results by iRouzine et al.l (120031 ) are relevant 
for viruses only. In fact, the traveling wave approach requires that the loga- 
rithm of the fitness distribution, not the fitness distribution itself, is smooth 
in the mutational load, which takes place when the high-fitness tail of the 
distribution |xo| (not the half- width) is larger than 1. The condition is met 
at much sm aller substitution rates, V ^ s/ ln{V/ Ub), which is the low bound 
assu med bv iDesai and Fisher fl2007f ). The likely reason for the confusion is 
that iRouzine et al.l (120031 ) normalized the fitness distribution assuming it is 
broad, which corresponds to the case V ^ s. When the wave is narrow, which 
happens in the interval s/ln{V/Ub) '^V <^ s, the normalization condition is 
simply replaced with 0(0) = (Subsection 2.4). Except for the trivial change 
in normalization, whether the wave is broad or narrow does not matter. In 
the final expression for the substitution rate, the difference between the two 
cases s/ \ia(y/Uh) -C -C s and s is in a constant factor at N, Eqs. ( 1521) 
and (1511) . Numerically, the two expression are quite similar in the parameter 
range we test. Therefore, the results of the traveling wave approach apply in 
both intervals of the substitution rate and are potentially relevant for a broad 
variety of asexual organisms, including RNA and DNA viruses, yeast, coral, 
and some species of plants and fish. 



To summarize, we have applied the semi-deterministic traveling wave approach 
to derive the rate of Muller's ratchet and the speed of adaptation in a multi- 
site model of an asexually reproducing population. Our results underscore the 
importance of a proper stochastic treament of the best-fit class. Our method, 
based on a continuous approximation to logarithm of the fitness distribution 
of the other (deterministic) classes, ensures fair accuracy in a very broad pa- 
rameter range. In addition, an analytic correction for the discreteness of fitness 
of the deterministic classes significantly increases the accuracy of predictions. 
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A Appendix 

A.l Main simplifications of the derivation 

Below we list the main simplifications employed in the derivation of the travel- 
ing wave solution and determine parameter regions in which they are asymp- 
totically correct. A brief summary of the validity conditions is as follows. The 
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values of s, U, and Ub are assumed to be much smaller than 1. The number of 
sites is large and the wave is far away from the two extremes of a mutation-free 
or a completely mutated genome, L 3> 1, fcav S> \xo\,L — fcav ^ |a;o|- Other 
conditions depend on the case. 

For Muller's ratchet, we assume s <^ U <ti 1 and ^ l/(f/cr^/^). These 
conditions ensure that the high-fitness tail is long. Also, for the ratchet to 
proceed rapidly and continuously rather than by exponentially rare clicks, we 
need ln(iVs3/2[/V2) <g [//^ (Section 3.3). 

For the adaptation regime, to have a single stochastic class, we assume V ^ 
Ub, which is equivalent to Ub -C s|xo|. To expand fitness in the mutational 
load, we assume s|a:o| -C 1. We also need a long tail of the fitness distri- 
bution, |xo| ^ 1. In terms of the substitution rate V, the last two con- 
ditions read s/ IniV/Ub) ^ V <^ 1/ ln(l/[/{,). In terms of A^, they read 

N > ^/^/Hs/Ub) and IniN^/dTb) < (1/s) ln\l/Ub). 

Simplification 1: Linear selection term. We can expand e~*^ under 
the condition s\x\ -C 1. In the case of Muller's ratchet, we find from Eq. (!2T!) 
that this condition is equivalent to -C 1. In the case of adaptive evolution, 
we find from Eq. (!39ll that this condition is equivalent to V ^ l/ln(l/f/fe), 
which is met numerically in all cases we consider (Fig. [Tj) and analytically at 
In(Arv^) < il/s) \Yi\l/Ub), Eq.(l53]). 

Simplification 2: Distribution is far from the origin. Because au depends only 
slowly (linearly) on fc, we are allowed to replace by a = a^^^ under the 
condition that L ^ 1, fcav ^ L — k^^ ^ \xq\ (i.e., the wave is far away 
from the two extremes of a mutation-free or a completely mutated genome, 
and narrow in comparison to the average mutational load). 

Simplification 3: Continuity in time. Approximating fk{t + l)/fk{t) with 1 + 
din fk{t)/dt is justified when \d\nfk{t)/dt\ -C 1. Replacing \dt\ and In fk{t) 
with \dx/{vU)\ and 0(x), respectively, we find that this condition takes the 
form U\v(j)'{x)\ -C 1. Because increases with \x\, the far low- fitness tail is 
not very important for evolution, and in the high-fitness tail |0'(a;)| < |0'(a;o)|, 
the sufficient condition takes a form f/|f|lnM ^ 1, where u is defined in 
Eq. ([13]). 

In the ratchet limit, a = 0, after substituting u from Eq. fl2Ul) . the validity 
condition becomes Uvln{l/v) <^ 1, which is met at any f , < f < 1, as long 
as t/ ^ 1. Note that the condition is equivalent to 5 ^ 1, as in Simplification 
1. 

In the adaptation limit, where deleterious mutations can be neglected, the 
validity condition takes a form V\n{V/Ub) -C 1, where V = —Uv and we used 
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Eq. (l38l) for u. Again, the validity condition is \S\ ^ 1, which is approxi- 
mately equivalent to the condition s|xo| -C 1 we used to expand fitness in k 
(Simplification 1). 

For sufficiently (and unrealistically) large N, V becomes large as well, and 
the condition 15*1 ^1 may be violated. In this region, technically, we can 
neither expand fitness in k nor replace discrete time with continuous time 



(Simplifications 1 and 3 cannot be used). Nevertheless, iRouzine et al.l (120031 ) 
showed that the continuous equation for the fitness distribution, Eq. (jlj), can 
be derived in a more general way, without assuming \d\n fk{t)/dt\ <^ 1, nor 
expanding fitness in k, nor neglecting multiple mutations per genome per 
generation [see the transition from Eq. (1) to Eq. (11) in the Supplementary 
Text of the quoted work]. In the present work, we use these approximations 
to simplify our derivation. 

Simplification 4- Continuity in k. Replacement of \n[fk+i{t) / fk{t)] with the 
partial derivative din fk{t)/dk is justified ii\d\nfk/dk\ \d'^\n fk/dk'^\. With 
din fk/dk = 0', we can rewrite this condition as |0'| \dx/d(f)'\ ^ 1. This condi- 
tion is already discussed in the section on the continuity correction for Muller's 
ratchet, and also in the section on the speed of adaptation, after Eq. (HUj) . In 
short, this simplification is justified if the left tail is long, |xo| ^ 1. Let us 
express this condition in terms of the substitution rate and A^. 

For Muller's ratchet, based on Eq. fl?Il) . the condition |xo| ^ 1 is met if 
(T ^ 1, and the ratchet rate is not to close to the maximum, l~v ^ ^/a. The 
latter condition is always met, because the more limiting condition is that the 



wave tail is longer than the half- width, |xo| ^ yVar[A;], which is equivalent 
to the requirement 0(0) — 4>{xo) S> 1. In terms of v, that condition reads 
1 — w ^ a^^^, Eqs. fl23|l and fl2T]) . The lower bound on follows from the final 
relation between A^ and v, Eq. (1361) . which yields A^ ^ l/(f/cr^/^). 

For adaptation, the condition of a long high-fitness tail can be found from Eq. 
fl39l) for |xo|, as given hj V ^ s/ lia{V/Ub) ~ s/ \n{s/Ub). The lower bound on 
A^ follows from the final result for the adaptation speed, Eq. fl52l) . as given by 



Ar> y^/ln(sM). 



Simplification 5: Slow change of the wave shape. In the derivation of Eq. ([6]), 
we neglected any change in the wave shape over time, i.e., we assumed that 
\d(j)/dt\ <C 1. This condition is satisfied wheii the d istribution is far from 



the origin. Simplification 2 [see (IRouzine et al.l . l2003l ). Supplementary Text, 
Approximation 6 for details]. 

Simplification 6: Broad and narrow wave. If the wave is broad, Var[A;] ^ 1, to 
determine 0(0), we can expand e^*^'^^-* near the wave center linearly as l±0'(x) 
if |0'(x)| -C 1. This condition is met for any t>, whether positive or negative. 
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if 0" ^ 1 (i.e., s ^ f/). If cr is not small, the condition is is met at large 
negative v (rapid adaptive evolution): \v\^ a oi V ^ s.\n the opposite limit 
of narrow wave, we have Var[/c] <^ 1, so that 0(a;) is narrow and cannot be 
approximated by a Gaussian even near its center. In this case, we use instead 
0(0) = 1. Together, these two limits cover the entire parameter range. 

Simplification 1: Single stochastic class in the adaptation case. Now we verify 
the initial approximation that the next-loaded class, k = ko — 1, can be treated 
deterministically. The average ratio of its size to the least-loaded class size is 
given by fko-i/ fko = exp[0'(a;o)] = u = V/Ub, Eq. fl5Sl) . Thus, the condition 
that only one class has to be treated deterministically is V ^ Ub. The strong 
inequality also ensures that, when a best-fit class gives rise to a new best- 
fit class, the former is much higher than the stochastic threshold, /max ^ 
l/(A^s|xo|), Eqs. (|46D and (|39|). 



A. 2 Connection between the width and speed of the wave — alternative deriva- 
tion 



We derived the maximum ratchet rate (f < 1 — 2a) from the variance of 
k, Eq. (fTTl) . which in turn we derived from the normal approximation to the 
shape of the wave, Eq. ([8]). We can alternatively derive the same result directly 
from Eq. We write /^(t + 1) - /fc(t) = Udfk{t)/dT, multiply both sides of 
Eq. ([3]) with {k — kg_^)lU, and sum over all k. We find 



dfk{t) 



^av;— ^ — - (1 - a) E(/c - k^^)fk-i{t) + aY^{k - A;av)/fe+i(t) 

k '-^'^ k k 



J2ik - A;av)[l + aik - k^,)]fkit) . (A.l) 



The term on the left hand side of Eq. flA.ip evaluates to 
k k ^' k 

dka^r 



dr 



(A.2) 



The first term on the right-hand side of Eq. (lA.ip evaluates to 



;i - a) E(fc - K.)fk-i{t) = (!-«) - W + (1 - fcav) E fk~iit) 

k k k 

= (l-a). (A.3) 



Likewise, 



aJ2ik - k^^)fk+i{t) = -a 



(A.4) 
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and 

J2ik - + aik - h,)]Mt) = cjY.ik- k^^fUt) . (A.5) 

k k 

Since J2kik — kgjvY fk{t) = Var[A;], we find 

v = l-2a- aVar[fc] , (A.6) 
which corresponds to Eq. (fTTI) . 

A. 3 Correction to the continuity approximation for the case of Muller's ratchet 

We wish to compute the steady state proportions fkif) from Eqs. (l32l) and (!33l) . 
under the periodic boundary condition 

/,_i(0) = /4l/(f/i;)]. (A.7) 
We introduce the generating function 

G{\t) = Y.fk,UtW- (A.8) 

n>0 

From Eqs. fl32l) and fl33|) . we obtain 

|-G(A, t) = -SG(A, t) + U\G{\ t), (A.9) 

so that 

G(A,t) =G(A,0)e(^^-^)*. (A.IO) 
We find ^(A, 0) from Eq. (^KT^ and Eq. ([32]) 

-ie'^GCA.Oj-i/feWe-*. (A.ll) 
Finally, using the fact that S = Uv\ia{e/v), we find 



G(A,0) = ^AW = AR. (A.12) 

6" — Aec" 6" 



The generating function G{X,t) contains all the information about the fk{t). 
By expanding Eqs. (1A.10|) and (]A.12p in powers of A and identifying the terms 
with the coefficients in Eq. (lA.Sp . we recover all the functions 

/*.(«) = A«(0)e-«. A„+,(«) = A„(0) e-^', (A,13) 

f.Mt) - ;,(o) (^'")V^ + ^"'(--/)-^-^-^"+^/V -.. ,A.14) 
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etc. We obtain the large n behaviour of fko+n from the singularities of ^(A, t): 
The first singularity is for A = f, so that fko+n(t) must increase like 1/f" 
for large n. More precisely, to obtain the asymptotic behavior of fko+nit), "we 
study the divergence of G{X,t) around A ~ f . We find 



~~E. x" 
— e X - 

V 



fkoiO)e 



(Uv-S)f_ 



(1/3) -Uvt , 



e\{l-X/v)^ 1-X/v 



(A. 15) 



where the remaining part ^jj{X, t) has no singularity at f = A. Using X]n>o 
1/(1 - x) and En>o(^ + = 1/(1 " we find 



(A. 16) 



where 5n{t) is defined by V^(A,t) = T.n>Q ^nif)X^ ■ Except for A = f , where 
ilj{X,t) is finite, the singularities of '?/'(A,t) are the same as the singulari- 
ties of G{X,t). We checked numerically that the next singularity is at A = 
8.07557e^'^^'^'^^^v. Therefore, 6n must decay faster than 1/(817)" for large n. 
Finally, for large n, 



h,+n{t) = /fco We"""*^ ln+^-Uvt + 0{l/8") ). (A.17) 



ev" 



We can now compare the proportions fk at, for instance, time t = l/{2Uv) 
(that is, in the middle of the cycle). 



In fko- 



f 1 



\2Uv 



2Uv 



— nlnf +ln 



2 / 5 



+0(l/8"), (A.18) 



or, alternatively, compute the average of In fk over one cycle: 



ln/fco+n(^)-ln/fco(i) = -nlnt;+ln 



2 / 4 



+ {n+- I In 



1+ 



1 



n + 



-l+C(l/8") 



(A.19) 

Both these asymptotic expressions give surprisingly good results already for 
n > 1. To order 1/n, the right hand sides of Eqs. (lA.lSp and (1A.19P are 
identical, and are given by 



2 \ 5 

nlnf + Inn + In ( — H h Oil/n^) . 



(A.20) 
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A. 4 Correction to the continuity approximation for the case of adaptive evo- 
lution 



As explained in the main text, deleterious mutation can be neglected for large 
V, V ^ U. We start from Eq. Q, which we write as 

fk{t + 1) = fk{t) + Utfk+i{t) - Sfk{t) for k > ko, (A.21) 

where S = U + s{k — /Cav)- For k < kQ, we have fk{t) = 0. Strictly speaking, 
at each time step t + 1 a proportion Uhfkoif) of the population arrives at site 
ko — 1. Yet these individuals are immediately removed from the population 
by genetic drift, because their number is too small. However, as time goes on, 
/fcg it) grows, and at some point the number of individuals reaching k^ — 1 \s 
large enough to survive genetic drift. A new site is occupied and the value of 
fco decreases. 

Let t = is the time at which a new clone is established at k^. The average 
time the traveling wave moves one notch in /c is 1/1^, where V is the average 
substitution rate. We approximate the process of adding new best-fit clones 
by a periodic process, i.e., a new clone will be established at site /cq — 1 after a 
time interval of length a clone at site ko — 2 after another interval 1/V, 
and so on. Then we have 



fk{l/V) = fk+m for k > ko. (A.22) 

From the reasoning about the stochastic cutoff in the case of adaptive evo- 
lution, we know that a new clone starts to grow deterministically once it 
exceeds the characterstic threshold fk{t) ~ 1/{\S\N). Furthermore, we know 
from Eq. fH^ that the adjacent class at this point in time is of size fk+i{t) ~ 
l/{2NUb). Thus we write 

fkoiO) = j^, fMV) = fko+m = ^ . (A.23) 

where C ~ 1 is an undetermined numeric constant that, as we show below, 
does not affect much the final result in the parameter range we study. 

By analogy with the derivation of the wave equation, we replace the difference 
equation (1A.21I) by a differential equation in t. However, we do not make the 
continuity approximation in k. As in the case of the continuity correction for 
Muller's ratchet, for k close the edge of distribution, we assume that 5* is a 
constant independent of k, S = U + sxq, where Xq is given by Eq. fl39l) . Thus, 
we use 

dfkit) 



dt 



UJk+iit) - Sfkit) for k>ko. (A.24) 
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We look for solutions of the form 

/,„+„(t)=^a,(t)e-^". (A.25) 



From Eq. (]A.24|) . we obtain 

a^{t) = aA(0)e(^''^"'-^)* . (A.26) 
From the periodicity condition, Eq. (lA.22p . we further obtain 

aA(0)e(^''^"'-^)/^ = aA(0)e-^ , (A.27) 
so that, for each A, either ax{0) = 0, or 

S = Ube-^ + XV . (A.28) 

Therefore, 

^o+n(t) = E«A(0)e-^("+^*). (A.29) 

A 

where the sum is over A that satisfy Eq. flA.28p . 

To make any further progress, we have to find the values of A. The function 
Ufje"^ + AV^ has the only minimum —V[ln{V/Ub) — 1], which is reached for 
A = —\n{V/Ub). From Eq. (15^ and the definition of S, we see that S = 
—V[\n(y/Ub) — l] + U. Since, in the adaptation limit, we assume that V ^ U, 
Eq. flA.28P will be satisfied for two values of A that are slightly larger and 
slightly smaller than —hi[V/Ub), as given by 



A+ = - HV/Ub) + pU/V , (A.30) 
A- = - \n{y/Ub) - ^lUjV . (A.31) 

If we now insert the expressions for A^ and A~ into Eq. (]A.29P and expand to 



first power in JUb/V^ we find 



fko+n(t) 



A + B{n + Vt) 



where we have introduced A = ax+{0)+ax-{0) and S = [a^- {0)—ax+{0)]y 2U /V . 
Note that we will not need these specific expressions for A and B. Instead, we 
derive A and B directly from Eq. ( ]A.23ll and obtain 

A--^ B- ^ 1^1-2^^ (A33) 
^~\S\N' 2\S\V ■ ^^-^^^ 



Putting everjd;hing together, we can obtain the fk{t) at mid-period [t = 
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{2V) as given by 



In (i^) - In (^) = \n{V/U,) + In 



l + 2n- 



8CV 



S\ + 2CV 



n 



(A.34) 

If IS"! ^ which is the case when \n{V/U\)) ^ 1, we can write the second 
term on the right-hand side as 



In 



2n + 1 + C 



ln(y/C/b) 



(A.35) 
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Symbol Definition 



ccfe ratio of the beneficial mutation rate to the total mutation rate in 

class k; at = f-i-hk/U 
a effective ratio of the beneficial mutation rate to the total mutation 
rate; a = Hhkav/U 

fk{t) frequency of a class of genomes with mutation number k at time t 
(p logarithm of genome frequency; = In / 

k the number of less-fit alleles in a genome, as compared to the best 

possible genome 
ko minimum value of k in the population 
fcav effective mutational load generating a fitness equivalent to 

the mean fitness of the population, kgy = — -j ln(^j, e"^*/^) 
L length of genome (number of sites) 
jj, mutation rate per site 

haploid population size (number of genomes) 
probability that a class of genomes has frequency / at time t 
s selection coefficient; relative fitness gain/loss per mutation 
5" effective coefficient of selection against the best-fit class in a population; 

S=U + s{ko - fcav) 

a rescaled selection coefficient; a = s/U 
t time (in generations) 

U effective mutation rate per genome per generation; U = jiL 

beneficial mutation rate per genome per generation; 1/^ = {k/L)U 

u u = e'^'(^o) 

V average substitution rate of beneficial mutations; V = —dkaf,{t)/dt 

V normalized ratchet rate (substitution rate of deleterious mutations); 

V = {l/U)dk^y{t)/dt 
Var[fc] variance of k in the population 

X relative mutational load of a class; x = k — /cav 
xq minimum value of x for highest-fitness class 



Table 1 

Variables used in this work. 



45 




Fig. 1. Schematic illustration of the solitary wave profile. The stochastic edge is 
the minimum-/^ boundary of the population's mutant distribution. There are no 
genomes with fewer than ko mutations in the population. The speed of the wave 
is primarily determined by how fast mutations are gained or lost at the stochastic 
edge. 




Fig. 2. The minimum of the function x{(p') determines the location of the stochastic 
edge, ko. 
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Fig. 3. Schematic illustration of the loss of best-fit class. The frequency of the least- 
-loaded class, fkoit), decays approximately exponentially in time, until it reaches 
the stochastic threshold 1/{SN). At the stochastic threshold, the best-fit class is 
rapidly lost (in time ~ 1/5'). 




Fig. 4. Schematic illustration of correction for discreteness. Our continuous treat- 
ment of fitness classes predicts that In fj^ (t) — In f^^ (t) grows linearly in k — ko near 
the high- fitness edge, k — ko \xo\. Discreteness of k introduces a correction near 
the edge, approximately equal to \n[{2/-s/e){k — fco -|- 5/6)]. 
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Fig. 5. Normalized ratchet speed as a function of population size: analytic re- 
sults versus simulation. Symbols correspond to simulation results, and lines corre- 
spond to theoretical predictions. The simulation results were obtained as described 



ipi 

(jRouzine et al.l . |2003| ). Beneficial mutations are absent (a = 0). (A) Results for 



s/U = 0.1. The solid blue and red lines follow from the present work, Eq. (I36p . 
The dotted green lines follow from Gordo and CharlesworthI (|2000al ). Eqs. (3a) and 
(3b). The dashed purple lines follow from Landel (jl998l ). Eq. (2c) times NU. Pa- 
rameters are sho wn. (B) As (A ), but for a = s/U = 0.01. The dashed purple lines 
follow again from iLandel (Il998l ^: the equivalent of the green lines in (A) falls out- 
side of the axis range. (C) Effect of discreteness correction. The red squares and 
circles are identical to those in (A) and (B). The solid lines correspond to the pre- 
diction of traveling- wave theory without the discreteness correction, i.e., without a 
factor of a inside of the logarithm on the left-hand-side of Eq. (j36l) . and without 
the denominator 1 — vln(e/v) + 5(t/6 inside the logarithm on the right-hand side of 
Eq. (j36p . The dashed lines correspond to Eq. (|36p without the entire second term 
on the right-hand-side. The arro w shows the value of at which the size of the 
least-loaded class no = Ne~^/^ ( Haigh . 19781 ) in the equilibrium distribution is 1 
(at s/U = 0.1). 
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Fig. 6. Schematic illustration of establishment of a new best-fit class. Once a ben- 
eficial mutant has survived drift, its frequency grows approximately exponentially 
in time. A further beneficial mutation is likely to arise and survive drift only in a 
short time interval of length l/S around the time when the currently least- loaded 
class has grown to its maximal value /max- 
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Fig. 7. Normalized substitution rate as a function of population size, in the absence 
of deleterious mutations. Symbols correspond to simulation results, and thick solid 
lines correspond to analytic predictions obtained (A,B) from Eq. (j5ip or (C) from 
Eq. (j52p . Thin lines in (A,B) indicate the analytic prediction without discreteness 
correction, i.e., without term ln[(y/s) ln(y/f7b)] on t he right-hand side of Eg . (I5ip . 
Dotted lines in (B,C) indicate the analytic result by Desai and Fisher ( 20071 ). Eqs. 
36, 38, and 39. Their result shown in (B) is outside of its designed validity range and 
is shown for r eference only. The sim ulations were carried out (A,B) in discrete time 
as described ( Rouzine et al. . 20031 ) or (C) in continuous time. V was measured in 
simulation as the average slope of ka_v{t) in the time interval [0.85to, l-15to], where 
the time to of the interval center was determined from the condition A;av(io) = 250. 
The initial /cav(O) was set to 1.5kav{to). The per-site beneficial mutation rate /i was 
chosen such that the genomic beneficial mutation rate C/b had the value shown on 
the plot at time to, Uy^ = ^/cav(io)- The size of the symbols roughly corresponds to 
the largest standard deviation of the mean speed estimates. 
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Fig. 8. Test of the accuracy of the stochastic edge treatment, Eq. (|36]) . Points are ob- 
tained by simulation using a simphfied two-class model. Only points corresponding 
to s|xo| < 0.2 are shown. Parameter values are shown. 
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